Amiga Developer CD 2.1


Amiga Developer CD 2.1 / Reference / DevCon / Orlando_1993 / Devcon93.4 / Notes / DevCon-93


Text File | 1993-01-08 | 20.9 KB | 448 lines

Advanced EXEC and CPU Issues

As we have seen, the pace of technological change is getting faster all the time. It is hoped that the Amiga and its applications will be able to keep up with these changes as well as possible. This, however, means that changes will need to take place in both the Amiga's OS and in application software.

With the release of V37 and V39 EXEC, a number of new concepts have been advanced to a stage from which new technologies can be launched. In fact, a number of the functions in V37 EXEC are required in order to make the system work with 68040 CPUs in a consistent manner (namely the CacheControl(), CachePreDMA(), CachePostDMA(), CacheClearU(), and CacheClearE() calls). With V39, the addition of private memory pools and the cleanup of the memory allocation routines have laid the groundwork for some more advanced memory systems.

New for V39

For V39, I had some time to clean up some of the areas of EXEC that could not be fixed in some external manner. The major changes were in the semaphores, memory subsystems, and the ROM debugger.

Semaphores are the key to making a multitasking system work as one system. EXEC has a very nice set of semaphore functions that are called SignalSemaphores. Before V39, these semaphores could only be used and accessed in a synchronous manner. That is, you made a function call that would block until you had obtained the semaphore. While this usually ends up being enough for most people, there are times when software may need to "bid" for a semaphore and go and do something else until it obtains it. EXEC already had this concept in the Procure() and Vacate() functions, but these functions were both broken and somewhat useless because they worked with a different semaphore structure than the SignalSemaphores.
As it turns out, I was able to reuse these two function calls. When used with SignalSemaphores they now work, and they are useful in that the same semaphore may be obtained both synchronously and asynchronously. See the AutoDocs on Procure() and Vacate() for more information as to how this works.

One of the features of the Amiga has always been the dynamic nature of its use of memory. This has also been one of the trickiest parts of good Amiga programming. Applications would like to dynamically use memory and release it, but with more than one running at the same time, memory gets fragmented and performance suffers. For V39, two different parts were added to the memory subsystem: pools and memory handlers.

Memory pools are a way to help combat memory fragmentation, increase the speed of the system, and give a simple way to keep associated memory together. Because memory pools are "private" to the application, a number of performance benefits are obtained (including not needing to go into Forbid() during allocation or deallocation from the pool). Since pools give the system a simple way to keep your allocations together, they also give the system a simple way to release your allocations by just destroying the pool as a whole. Also, the design was left black-box such that future system enhancements can be made to pools without too many problems. (A very important point here is that pools will be the memory interface of choice in the future.) For more information on the memory pools, see the AutoDocs and the next AmigaMail.

Memory handlers are an extension to the Amiga's physical memory management system. As you know, when a memory allocation fails, the system will first attempt to release any resources that are no longer in use and retry the allocation before it will truly fail the allocation.
This design was very innovative when the Amiga first came out, but it was not complete enough to let applications cache data as long as there was enough memory or to do other, more complex memory usage games. The memory handler system expands this feature and makes a number of performance problems much easier to deal with (such as rasterized outline font caching or large database RAM caching). It also makes it possible for the caching code to know how much and what type of memory the currently failing allocation is for, so that it can more intelligently release memory without releasing everything.

The following is a quick overview of the design goals behind the memory handler design. As a side benefit of this work, most memory allocations are over 100 cycles faster on a 68000 based machine since the overhead of RAMLIB's SetFunction() of AllocMem() is no longer an issue.

------------

Memory Handler - Quick Overview

The basic design is a handler list that is called when a memory allocation fails. The handler list (just like input.device, etc) will contain routines that applications and libraries added. Each handler in the list will be called in order until the memory allocation works or the handler list is completed. Only after the handler list has been completely traversed will the allocation fail.

The handler list is a standard exec-style list that is stored in priority order. RAMLIB, which currently SetFunctions the AllocMem() routine, will no longer do this but rather add itself to the handler list at priority 0. This lets applications come before and after the RAMLIB expunge.

-------------

New functions:

There will need to be two new functions in EXEC to deal with the handler list. There will also be a new flag to AllocMem().

The basic functions are:

    void AddMemHandler(struct Interrupt *)
                       a1

    void RemMemHandler(struct Interrupt *)
                       a1

AddMemHandler() - This function will take the handler given and enqueue it onto the memory handler list.
Once on the list, the handler must be ready to be called. This means that the handler must be ready to be called before this function even returns.

RemMemHandler() - This function will remove the handler from the list. This function *CAN* be called while within the handler.

A new memory flag, MEMF_NO_EXPUNGE, will be added to exec. This flag will cause the memory allocation attempt to fail without going through the memory handlers. This is useful for caching systems that may not really need the memory but will take it if available. It is also required for use within a handler, so that memory can be allocated (or at least attempted) during the expunge cycle. This flag will be ignored in systems where there is no memory handler.

    BITDEF  MEM,NO_EXPUNGE,31   ;AllocMem: Do not call expunge on failure

------------

The MemHandler structure:

This structure is the data passed to a MemHandler. This structure is *READ ONLY*

    struct MemHandlerData
    {
        ULONG   memh_RequestSize;   /* Size of the requested allocation */
        ULONG   memh_RequestFlags;  /* Flags of the requested allocation */
        ULONG   memh_Flags;         /* Flags (see below) */
    };

The memh_RequestSize and memh_RequestFlags are the size and flags arguments from the AllocMem() call that failed.

    BITDEF  MEMH,RECYCLE,0      ; Recycle

The MEMHF_RECYCLE flag is 0 if this is the first time this handler was called due to this allocation failure. If it is 1, the handler is being called again for the same failure. See below about handler return and recycling...

------------

The Handler:

The protocol for a MemHandler must be strictly followed. Because the handlers are called in the context of the task that called AllocMem(), and AllocMem() *MUST* *NOT* break a Forbid(), the handler *MUST* *NOT* break a Forbid().

Another issue is stack usage. The handler could be running on any task in the system that calls AllocMem(). For this reason, the handler must try to keep stack usage as low as possible.
Exact stack usage is not available, but a good rule would be to keep it under 128 bytes if possible.

The handler may call AllocMem() with the new MEMF_NO_EXPUNGE flag. This flag is new to the exec that has the memory handler system. Library expunge vectors can *not* make use of this feature. This flag would let a handler move memory from one location to another. For example, if the requested memory is for CHIP, the handler could move any of its CHIP allocations that it can to FAST memory (if possible) and would then be able to help satisfy the MEMF_CHIP request. Also, caching systems may wish to only cache an item if memory is available and would not want to have the system do an expunge just to cache this "unimportant" item.

The handler will be called in a Forbid() state that *MUST* *NOT* be broken. A handler can RemMemHandler() itself *ONLY* if it returns MEM_DID_NOTHING or MEM_ALL_DONE.

The handler code, which is in (*is_Code)() of the interrupt structure, will be called as follows:

    a0=Pointer to (struct MemHandlerData)
    a1=Value from is_Data
    a2=Pointer to the Interrupt structure for this handler
    a6=ExecBase

The handler must follow the standard rules about register usage. Only d0, d1, a0, and a1 may be modified; all other registers *MUST* remain unchanged.

Return results:

    d0= MEM_DID_NOTHING or MEM_ALL_DONE or MEM_TRY_AGAIN

MEM_DID_NOTHING - If the handler could not release any resources it should return with d0 set to this.

MEM_ALL_DONE - If the handler released all of its resources, it should return this in d0.

MEM_TRY_AGAIN - If the handler released some resources in hopes that it will have solved the memory problem it can return with this value. In that case, EXEC will retry the allocation and, if it does not work, will call the handler again. Note that the handler can tell if it was already called by the MEMHF_RECYCLE flag, which will be 0 if this is the first call to the handler.
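
The three return values can be illustrated with a small host-side sketch in C. It only models the decision logic of a caching handler: release enough to cover the failed request on the first call, release everything when called again for the same failure, and report which case applied. Everything here is an assumption made for illustration: the Cache type, the function names, and the numeric values of the MEM_* and MEMHF_RECYCLE constants are stand-ins (take the real ones from the exec include files), and free() stands in for whatever release call the cache would really use. A real handler is entered with its arguments in a0/a1/a2/a6 as listed above and must not break the Forbid().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins so the sketch compiles on any host. The numeric values are
   assumptions for illustration; use the real exec headers on an Amiga. */
#define MEM_DID_NOTHING   0
#define MEM_ALL_DONE    (-1)
#define MEM_TRY_AGAIN     1
#define MEMHF_RECYCLE   (1UL << 0)

struct MemHandlerData {
    uint32_t memh_RequestSize;   /* size of the failed allocation  */
    uint32_t memh_RequestFlags;  /* flags of the failed allocation */
    uint32_t memh_Flags;         /* MEMHF_RECYCLE on repeat calls  */
};

#define CACHE_SLOTS 4

/* Hypothetical cache: a few heap blocks, oldest in slot 0. */
struct Cache {
    void  *slot[CACHE_SLOTS];
    size_t size[CACHE_SLOTS];
};

static int cache_is_empty(const struct Cache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (c->slot[i] != NULL)
            return 0;
    return 1;
}

/* Release entries oldest-first until at least `want` bytes are freed. */
static void cache_release(struct Cache *c, size_t want)
{
    size_t freed = 0;
    for (int i = 0; i < CACHE_SLOTS && freed < want; i++) {
        if (c->slot[i] != NULL) {
            free(c->slot[i]);        /* stand-in for the real release call */
            freed += c->size[i];
            c->slot[i] = NULL;
            c->size[i] = 0;
        }
    }
}

/* Decision logic of the handler; returns the value that would go in d0. */
int cache_mem_handler(const struct MemHandlerData *mhd, struct Cache *c)
{
    assert(mhd != NULL && c != NULL);

    if (cache_is_empty(c))
        return MEM_DID_NOTHING;      /* nothing to give back */

    /* First call: release just enough for the request.
       Recycle call: the partial release did not help, so drop it all. */
    size_t want = (mhd->memh_Flags & MEMHF_RECYCLE)
                      ? SIZE_MAX
                      : (size_t)mhd->memh_RequestSize;
    cache_release(c, want);

    return cache_is_empty(c) ? MEM_ALL_DONE : MEM_TRY_AGAIN;
}
```

The sketch ignores memh_RequestFlags; a real handler would probably look at it so that, for example, a CHIP request does not cause FAST-only cache entries to be thrown away.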
The main use of the MEM_TRY_AGAIN return value is to help implement the RAMLIB handler, but it could be useful for LRU caching code or caching code that tries to defragment memory during expunge in order to try to satisfy the allocation request.

-------------

RAMLIB:

RAMLIB will, under this system, no longer SetFunction the memory allocation routines but rather add a memory handler at priority 0. This handler would then be called when the allocation failed and RAMLIB could then call the library expunge vectors as it does today. If RAMLIB wishes to continue to do the 2.0 partial expunge, that would be possible with the MEM_TRY_AGAIN return value.

-------------

Another key point in the design of V39 EXEC was to provide a low-level debugging core that can be used to debug rather complex problems. This low-level debugger, the Simple Amiga Debugging kernel, SAD, replaces ROM-WACK from pre-V39 systems.

One of the goals of SAD was to provide near emulator level access to debugging the Amiga. Due to some minor hardware issues, this was not 100% implemented. The goal was to use the unused NMI interrupt to trap into the SAD kernel and then let the controlling system talk to SAD and do whatever is needed. By default, due to hardware issues on certain Amiga models, SAD is not connected to the NMI vector. It is ready to be connected, but it is not.

The Simple Amiga Debugging Kernel (SAD) is a set of very simple control routines stored in the Kickstart ROM that let debuggers control the Amiga's development environment from the outside. These tools make it possible to do remote machine development/debugging via just the on-board serial port. This set of control routines is very simple and yet completely flexible, thus making it possible to control the whole machine.

Technical Issues

SAD will make use of the motherboard serial port that exists in all Amiga systems.
The connection via the serial port lets the system execute SAD without needing any of the system software up and running. (SAD will play with the serial port directly.) With some minor changes to the Amiga hardware, an NMI-like line could be hooked up to a pin on the serial port. This would allow external control of the machine and would let the external controller stop the machine no matter what state it is in. (NMI is that way.)

In order to function correctly, SAD requires that some of the EXEC CPU control functions work and that ExecBase be valid. Beyond that, SAD does not require the OS to be running.

Command Overview

The basic commands needed to operate SAD are as follows:

    Read and Write memory as byte, word, and longword.
    Get the register frame address (contains all registers)
    JSR to Address
    Return to system operation (return from interrupt)

These basic routines will let the system do whatever is needed. Since the JSR to address and memory read/write routines can be used to download small sections of code that could be used to do more complex things, this basic command set is flexible enough to even replace itself.

Caches will automatically be flushed as needed after each write. (A call to CacheClearU() will be made after the write and before the command done sequence.)

Technical Command Descriptions

Since the communication with SAD is via a serial port, data formats have been defined for minimum overhead while still giving reasonable data reliability. SAD will use the serial port at a default of 9600 baud, but the external tools can change the serial port's data rate if they wish. They would need to make sure that they will be able to reconnect. SAD sets the baud rate to 9600 each time it is entered. However, while within SAD, a simple command to write a WORD to the SERPER register would change the baud rate. This will remain in effect until you exit and re-enter SAD or until you change the register again.
(This can be useful if you need to transfer a large amount of data.)

All commands have a basic format that they will follow. All commands have both an ACK and a completion message.

Basic command format is:

    SENDER:  $AF <command byte> [<data bytes as needed by command>]

    Receive:
        Command ACK:   $00 <command byte>
        Command Done:  $1F <command byte> [<data if needed>]

    Waiting:                            $53 $41 $44 $BF
    Waiting when called from Debug():   $53 $41 $44 $3F
    Waiting when in dead-end crash:     $53 $41 $44 $21

The data sequence will be that SAD will emit a $BF and then wait for a command. If no command is received within <2> seconds, it will emit $BF again and loop back. (This is the "heart beat" of SAD.) When called from Debug() and not the NMI hook, SAD will use $3F as the "heart beat". If SAD does not get a response after <10> heartbeats, it will return to the system. (Execute an RTS or RTE as needed.) This is to prevent a full hang. The debugger at the other end can keep SAD happy by sending a NO-OP command.

All I/O in SAD times out. During the transmission of a command, if more than 2 seconds pass between bytes of data, SAD will time out and return to the prompt. This is mainly to help make sure that SAD can never get into an infinite loop situation.

Data Structure Issues

While executing in SAD, you have full access to the machine from the CPU standpoint. However, this could also be a problem. It is important to understand that when SAD is entered via NMI, many system lists may be in an unstable state. (NMI can happen in the middle of the AllocMem routine or a task switch, etc.) Also, since you are doing debugging, it is up to you to determine what operations can be done and what can not be done. A good example is that if you want to write a WORD or LONG, the address will need to be even on 68000 processors. Also, if you read or write memory that does not exist, you may get a bus error. Following system structures may require that you check the pointers at each step.
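
For the host side of the link, the command framing described under Technical Command Descriptions, together with the even-address rule just mentioned, can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the actual tool code: all function, enum, and macro names are made up here, and the command byte is passed in as a parameter because the command numbers themselves are not given in these notes (see the EXEC AutoDocs). No serial I/O or timeout handling is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAD_CMD_START 0xAF  /* first byte of every command sent to SAD */
#define SAD_ACK       0x00  /* reply: command received                 */
#define SAD_DONE      0x1F  /* reply: command completed                */

/* Which "waiting" sequence was seen ("SAD" followed by one byte). */
enum sad_prompt {
    SAD_PROMPT_NONE = 0,
    SAD_PROMPT_NMI,     /* $53 $41 $44 $BF */
    SAD_PROMPT_DEBUG,   /* $53 $41 $44 $3F, entered via Debug() */
    SAD_PROMPT_CRASH    /* $53 $41 $44 $21, dead-end crash      */
};

/* Build "$AF <command byte> [<data>]" into out.
   Returns the packet length, or 0 if it does not fit in cap bytes. */
size_t sad_build_command(uint8_t cmd, const uint8_t *data, size_t len,
                         uint8_t *out, size_t cap)
{
    assert(out != NULL);
    if (len > cap || cap - len < 2)
        return 0;
    out[0] = SAD_CMD_START;
    out[1] = cmd;
    if (len > 0)
        memcpy(out + 2, data, len);
    return len + 2;
}

/* True if buf starts with the given reply kind (SAD_ACK or SAD_DONE)
   for command cmd. */
int sad_is_reply(const uint8_t *buf, size_t n, uint8_t kind, uint8_t cmd)
{
    return n >= 2 && buf[0] == kind && buf[1] == cmd;
}

/* Classify a 4-byte heartbeat sequence. */
enum sad_prompt sad_classify_prompt(const uint8_t *buf, size_t n)
{
    if (n < 4 || memcmp(buf, "SAD", 3) != 0)
        return SAD_PROMPT_NONE;
    switch (buf[3]) {
    case 0xBF: return SAD_PROMPT_NMI;
    case 0x3F: return SAD_PROMPT_DEBUG;
    case 0x21: return SAD_PROMPT_CRASH;
    default:   return SAD_PROMPT_NONE;
    }
}

/* On a 68000, WORD and LONG writes need an even address;
   width is the access size in bytes (1, 2 or 4). */
int sad_write_address_ok(uint32_t addr, unsigned width)
{
    return width == 1 || (addr & 1u) == 0;
}
```

A debugger built on these helpers would wait for a heartbeat, send a packet, and then expect the ACK followed by the Done reply for the same command byte, giving up if either fails to arrive within the timeouts described above.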
When entered via Debug(), you are running as a "task", so you will be able to assume some things about system structures. This means that you are not in supervisor state and that you can assume that the system is at least not between states. However, remember that since you are debugging the system, some bad code could cause data structures to be invalid. Again, standard debugging issues are in play. SAD just gives you the hooks to do whatever you need.

Note: When SAD prompts with $BF you will be in full disable/forbid state. When prompting with $3F, SAD will only do a Forbid(). It is possible for you to then disable interrupts as needed. This is done so that it is possible to "run" the system from SAD when called with Debug().

Data Frames and the Registers

SAD generates a special data frame that can be used to read what the registers contain and to change the contents of the registers. See the entry for GET_CONTEXT_FRAME for more details.

For more information on how SAD works, please check the EXEC AutoDocs.

The Future

Now that I have talked about the major changes to EXEC for V39, we should look into what the future may bring. What follows is a general description of the "vision" I have for the EXEC of the future. It does not mean that exactly these features, or that all or only these features, will be implemented. However, it does show you some of the directions in which we hope to be able to push the operating system.

One of the future features that is already somewhat in use is CPU-specific support libraries. Currently, there is a 68040.library which patches itself into EXEC and the system to provide the functions needed to make the 68040 based Amiga work. Future processors will also need such support libraries. As such, a goal will be to have a different library for each of the processor groups.
This library would take care of the various processor specific issues such as instruction emulation, cache control, MMU support, and other things that are system-level and CPU specific. (In other words, there will be a 68060.library and maybe even a 68030.library.)

Along with the CPU-based extensions, the much asked for and very much misunderstood feature of virtual memory would become an issue. The design (from the concept point of view) is rather far along at this point in time, and a number of changes to the memory system will now make this possible. In order to prevent compatibility problems and other issues, the only way to obtain virtual memory would be to obtain a private pool with the attribute of PAGED memory. As a side issue, since it is only via private pools that such memory can be allocated, it may be possible to have a form of protected pools too.

Object oriented programming has become a major "key" word in the market today. While much of the OOP hype is just that, there are a number of benefits that can be had in a system that supports object oriented features. The benefits, however, are only available if there is a good base from which the objects are built, one that includes the core functions that make up the basic OOP interfaces.

For 2.0, Jim Mackraz implemented an object oriented gadget/image system for Intuition. The Basic Object Oriented Programming System for Intuition, or BOOPSI as we call it, was a very important improvement in dealing with the details of user interface building and operation. The model was designed with those specific goals in mind and it added a great deal to the capability of Intuition and the user interfaces that can be built in Intuition.

One need, however, is for objects that may not be user interface based or have any need for those aspects. Example objects would be a data retrieval object or maybe a network link object or even a "thread" object.
In fact, given the correct core set of objects, a full resource tracking system can be built out of just having objects dispose of their parts when the "process" or "task" object is disposed.

The object support that I envision would be very low overhead support for runtime "linked" objects. (Much like shared code libraries are runtime "linked".) It would provide for both high-speed method inheritance and complex multiple inheritance. Disk loaded object classes and application embedded private object classes would both be supported.

In addition, the long-awaited task-tree (child-tasking) support is once again on the list of things to do. This would give tasks the ability to be notified about their child tasks (or parent task). Some of this may be well suited to the object-based "task" or "thread" construct.

Debugging support will also continue to get better, both via tools such as Enforcer and via better support within the new constructs to check for and report invalid operations. The hardest part of developing a major application is the verification of its quality. The system should be able to help here.

Conclusion

So, while much work needs to be done just to keep up with other aspects of the Amiga OS and the hardware (including new CPUs), there are a number of key issues/features that would make for an even better system. Some of these are good because they are great "PR" (everyone talks about getting VM, even the 1-floppy A1200 owner who does not even have an MMU) and others are just very useful for system construction and simplified application development. (Debugging software on such a complex system can be a pain.)